A compiler toolkit for array-based languages targeting CPU/GPU hybrid systems

Author

  • Laurie Hendren
Abstract

This paper presents a compiler toolkit that addresses two important emerging challenges: (1) effectively compiling dynamic array-based languages such as MATLAB, Python and R; and (2) effectively utilizing a wide range of rapidly evolving hybrid CPU/GPU architectures. The toolkit provides: a high-level IR specifically designed to express a wide range of array-based computations and indexing modes; Velociraptor, a CPU/GPU code generator and runtime library; and RaijinCL, a portable autotuning GPU library for key BLAS routines. A compiler developer uses the toolkit by generating VelociraptorIR for key parts of an input program and then using Velociraptor to automatically generate CPU/GPU code. The toolkit leverages OpenCL and LLVM for GPU and CPU code generation, respectively, and can thus be used for a wide variety of target architectures. To demonstrate different possible uses of the toolkit, the paper presents a proof-of-concept CPU/GPU Python compiler and a GPU extension of a MATLAB JIT.

Similar resources

VRIR specifications

Array-based languages, such as MATLAB and Python’s NumPy library, usually share some common features. For example, both MATLAB and Python have matrix multiplication built into the language and offer support for array slicing operators. The exact semantics of some array operators may vary slightly, but the basic ideas are substantially similar. Languages like MATLAB and especially Py...

An OpenMP Programming Toolkit for Hybrid CPU/GPU Clusters Based on Software Unified Memory

Recently, hybrid CPU/GPU clusters have drawn much attention from high-performance computing researchers because of their energy efficiency and adaptable resource exploitation. However, programming hybrid CPU/GPU clusters is very complex because it requires users to learn new programming interfaces such as CUDA and OpenCL, and to combine them with MPI and OpenMP. To address this probl...

Ultra-Fast Image Reconstruction of Tomosynthesis Mammography Using GPU

Digital Breast Tomosynthesis (DBT) is a technology that creates three-dimensional (3D) images of breast tissue. Tomosynthesis mammography detects lesions that are not detectable with other imaging systems. If image reconstruction time is on the order of seconds, tomosynthesis systems can be used to perform tomosynthesis-guided interventional procedures. This research has been designed to study u...

Neon: A Domain-Specific Programming Language for Image Processing

Neon is a high-level domain-specific programming language for writing efficient image processing programs that can run on either the CPU or the GPU. End users write Neon programs in a C# programming environment. When the Neon program is executed, our optimizing code generator outputs human-readable source files for either the CPU or GPU. These source files are then added to the user source tre...

Can PCM Benefit GPU? Reconciling Hybrid Memory Design with GPU Massive Parallelism for Energy Efficiency

In recent studies, phase change memory (PCM) has shown promising energy efficiency for systems with a modest level of parallelism. But it remains an open question whether it can benefit GPU-like massively parallel systems. This work conducts the first systematic investigation into this question. It empirically shows that, contrary to the promising results shown before on CPU, the previous desi...

Publication date: 2012